
[Common] Auto-generate sitemap.xml and strengthen robots.txt #1258

Open
dooohun wants to merge 3 commits into develop from feat/#1256/seo-sitemap

Conversation

@dooohun
Contributor

@dooohun dooohun commented May 11, 2026

What is this PR? 🔍

Changes 📝

  • next-sitemap.config.js: siteUrl (automatic stage/prod branching), private path excludes, and the robots Disallow policy (a sketch follows the Disallow targets list below)
  • package.json: adds next-sitemap and chains it onto the build script, since Yarn 4 does not run the postbuild hook automatically (see the snippet after this list)
  • .gitignore: adds public/sitemap*.xml and public/robots.txt
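
Chaining keeps the sitemap step inside yarn build without relying on a postbuild hook. Roughly, the relevant package.json parts (as also described in the walkthrough below):

  {
    "scripts": {
      "build": "tsc && next build && next-sitemap"
    },
    "devDependencies": {
      "next-sitemap": "^4.2.3"
    }
  }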

Removed

  • public/robots.txt: replaced by automatic generation at build time

Disallow targets

  • /auth, /webview, /report, /callvan/add|chat|notifications, /timetable/modify, /clubs/new|edit|recruitment/edit, /lost-item/edit|report|chat, /store/review, /monitoring
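
For reference, a minimal sketch of what next-sitemap.config.js looks like under this approach. The option values and domains here are placeholders and assumptions for illustration; the file in this PR is the source of truth, and the excerpt in the review thread below shows the actual exclude/robots section.

  // next-sitemap.config.js (sketch; ESM because package.json sets "type": "module")
  const isStage = (process.env.NEXT_PUBLIC_API_PATH ?? '').includes('stage');

  // Routes that require auth or expose personal data; kept out of the sitemap.
  const PRIVATE_PATHS = [
    '/auth/*',
    '/webview/*',
    // ...the remaining routes from the Disallow targets list above
  ];

  export default {
    siteUrl: isStage ? 'https://stage.example.com' : 'https://www.example.com', // placeholder domains
    generateRobotsTxt: true,
    sitemapSize: 7000,      // assumed value
    changefreq: 'daily',    // assumed value
    priority: 0.7,          // assumed value
    exclude: ['/404', '/_error', '/_app', '/_document', ...PRIVATE_PATHS],
    robotsTxtOptions: {
      policies: [
        {
          userAgent: '*',
          allow: '/',
          disallow: ['/auth/', '/webview/', /* ...same prefixes as the list above */],
        },
      ],
    },
  };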

Screenshot 📷

N/A

Test Checklist ✅

  • sitemap.xml and robots.txt are generated correctly after yarn build
  • Access verified after deployment

Precaution

  • next-sitemap.config.js is ESM (export default), which matches "type": "module" in package.json
  • Dynamic SSR routes (clubs/[id], lost-item/[id]) are not auto-detected for the sitemap; if needed, they can be added with additionalPaths in a follow-up PR (a rough sketch follows this list)
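
A rough shape for that follow-up, using next-sitemap's additionalPaths hook; fetchClubIds() is a hypothetical helper, not something that exists in this PR:

  // Hypothetical follow-up sketch: feed dynamic SSR routes into the sitemap.
  export default {
    // ...existing options from next-sitemap.config.js...
    additionalPaths: async (config) => {
      const clubIds = await fetchClubIds(); // hypothetical: returns an array of club ids
      return Promise.all(
        clubIds.map((id) => config.transform(config, `/clubs/${id}`)),
      );
    },
  };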

✔️ Please check if the PR fulfills these requirements

  • It is submitted to the correct branch, not to develop unconditionally
  • If it is a hotfix branch, it targets main
  • There are no warning messages when you run yarn lint

Summary by CodeRabbit

  • New features

    • Automatic sitemap and robots.txt generation for better SEO and crawl control
    • The staging or production domain is selected automatically per environment so the correct sitemap is generated
  • Changes (build)

    • Added a sitemap generation step to the build workflow
  • Chores

    • Updated ignore rules to skip build artifacts (sitemap/robots, etc.) and the build plugin environment file

Review Change Stack

- Add next-sitemap 4.2.3
- next-sitemap.config.js: stage/prod environment branching, private path excludes,
  and robots.txt Disallow policy (auth, webview, report, callvan,
  timetable/modify, clubs/new|edit, lost-item/edit|report, store/review,
  monitoring)
- package.json: chain next-sitemap onto yarn build
  (Yarn 4 does not automatically run the postbuild lifecycle hook)
- .gitignore: ignore public/sitemap*.xml and public/robots.txt (build artifacts)
- public/robots.txt: remove the existing static file (replaced by build-time generation)
@dooohun dooohun self-assigned this May 11, 2026
@dooohun dooohun added the ✨ Feature 기능 개발 label May 11, 2026
@coderabbitai

coderabbitai Bot commented May 11, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 174799d5-7d65-4c9b-99cd-1ddfc08c7f06

📥 Commits

Reviewing files that changed from the base of the PR and between 1e44aba and d190006.

📒 Files selected for processing (1)
  • next-sitemap.config.js
🚧 Files skipped from review as they are similar to previous changes (1)
  • next-sitemap.config.js

Walkthrough

Integrates next-sitemap into the Next.js build: it selects an environment-specific siteUrl, generates a sitemap and robots.txt that exclude PRIVATE_PATHS, and adds the generated artifacts to .gitignore.

Changes

Sitemap and robots.txt generation integration

  • Environment-based site URL selection (next-sitemap.config.js): selects the staging or production siteUrl depending on whether NEXT_PUBLIC_API_PATH contains 'stage'.
  • Private path definition (next-sitemap.config.js): defines the route patterns that require authentication or are private in a PRIVATE_PATHS array.
  • Sitemap generation config export (next-sitemap.config.js): sets the sitemap options (sitemapSize, changefreq, priority), merges the default Next.js routes and PRIVATE_PATHS into exclude, defines the robotsTxtOptions policy, and exports it as the default export.
  • Build script and dependencies (package.json): changes the build script to tsc && next build && next-sitemap and adds next-sitemap@^4.2.3 to devDependencies.
  • Generated artifact ignore rules (.gitignore): adds .env.sentry-build-plugin, public/sitemap*.xml, and public/robots.txt to the git ignore list.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested labels

⚙ Setting

Suggested reviewers

  • ff1451
  • ParkSungju01
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title check: ✅ Passed. The PR title clearly describes the core of the change and accurately reflects the main changes: automatic sitemap.xml generation via next-sitemap and a strengthened robots.txt.
  • Docstring Coverage: ✅ Passed. No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
  • Linked Issues check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check: ✅ Passed. Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.






@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@next-sitemap.config.js`:
- Around line 43-65: The robotsTxtOptions.policies[0].disallow array is missing
specific patterns and non-trailing-slash variants so crawlers can still reach
private routes; update the disallow list used by robotsTxtOptions in
next-sitemap.config.js (and ensure PRIVATE_PATHS is honored) to add the wildcard
patterns '/store/review/edit/*' and '/monitoring/*' and include both trailing
and non-trailing variants for the listed prefixes (e.g., add '/auth' in addition
to '/auth/', '/webview' in addition to '/webview/', and '/report' in addition to
'/report/') so all exact and nested paths are disallowed by the policy.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f28896ab-d791-446d-8843-945f31dc1532

📥 Commits

Reviewing files that changed from the base of the PR and between 0ae4a8b and d22af96.

⛔ Files ignored due to path filters (6)
  • .pnp.cjs is excluded by !.pnp.cjs and included by **
  • .yarn/cache/@corex-deepmerge-npm-4.0.43-390ed1175e-c9ac6163e9.zip is excluded by !**/.yarn/**, !**/*.zip, !.yarn/** and included by **
  • .yarn/cache/@next-env-npm-13.5.11-7bdb393acd-2d34ec742e.zip is excluded by !**/.yarn/**, !**/*.zip, !.yarn/** and included by **
  • .yarn/cache/next-sitemap-npm-4.2.3-463254c2cb-8e88c941b5.zip is excluded by !**/.yarn/**, !**/*.zip, !.yarn/** and included by **
  • public/robots.txt is excluded by !public/** and included by **
  • yarn.lock is excluded by !**/yarn.lock, !**/*.lock, !yarn.lock and included by **
📒 Files selected for processing (3)
  • .gitignore
  • next-sitemap.config.js
  • package.json

Comment thread next-sitemap.config.js
Comment on lines +43 to +65
  exclude: ['/404', '/_error', '/_app', '/_document', ...PRIVATE_PATHS],
  robotsTxtOptions: {
    policies: [
      {
        userAgent: '*',
        allow: '/',
        disallow: [
          '/auth/',
          '/webview/',
          '/report/',
          '/callvan/add',
          '/callvan/chat/',
          '/callvan/notifications',
          '/timetable/modify',
          '/clubs/new',
          '/clubs/edit/',
          '/clubs/recruitment/edit/',
          '/lost-item/edit/',
          '/lost-item/report/',
          '/lost-item/chat',
          '/store/review/',
          '/monitoring',
        ],

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

find . -name "next-sitemap.config.js" -o -name "*sitemap*" -type f | head -20

Repository: BCSDLab/KOIN_WEB_RECODE

Length of output: 1409


🏁 Script executed:

fd -e js -e ts -e json | grep -i "sitemap\|next.config" | head -20

Repository: BCSDLab/KOIN_WEB_RECODE

Length of output: 90


🏁 Script executed:

git ls-files | grep -i sitemap

Repository: BCSDLab/KOIN_WEB_RECODE

Length of output: 151


🌐 Web query:

next-sitemap exclude robotsTxtOptions disallow behavior configuration

💡 Result:

In next-sitemap, the 'exclude' configuration and 'robotsTxtOptions' (specifically its 'policies' with 'disallow') serve distinct purposes with no direct interaction between them.

The 'exclude' option is an array of relative paths (supporting wildcards such as '/private/*') that prevents specific paths from being included in the generated sitemap XML files (sitemap.xml, sitemap-*.xml). It does not affect robots.txt generation.[1][2][3]

Separately, 'robotsTxtOptions.policies' defines crawl policies for the generated robots.txt file. Each policy has a 'userAgent', optional 'allow' (string | string[]), and optional 'disallow' (string | string[]) directives. These follow standard robots.txt syntax to instruct crawlers on allowed/disallowed paths. For example:

  robotsTxtOptions: {
    policies: [
      {
        userAgent: '*',
        allow: '/',
        disallow: ['/admin/*', '/private/'], // crawlers should not crawl these paths
      },
      { userAgent: 'bad-bot', disallow: '/' },
    ],
  }

This generates robots.txt entries like:

  User-agent: *
  Allow: /
  Disallow: /admin/*
  Disallow: /private/

  # bad-bot
  User-agent: bad-bot
  Disallow: /

[3][2]

'robotsTxtOptions.additionalSitemaps' adds extra sitemap URLs to robots.txt (e.g., custom server-side sitemaps), but these may inadvertently appear in the main sitemap index despite 'exclude', due to known issues (e.g., #823, #481). Workarounds include custom 'transform' functions or disabling index sitemaps.[4][5][3]

Disallowing paths in robots.txt does not exclude them from sitemaps; crawlers use sitemaps to discover URLs but respect robots.txt for crawling/indexing. Conversely, excluding paths from the sitemap hides them from discovery but does not prevent crawling if they are linked elsewhere.[1][6]


🌐 Web query:

robots.txt trailing slash pattern matching /auth vs /auth/

💡 Result:

In robots.txt path patterns, rules are matched by "path prefix" semantics: a crawler checks whether the URL path starts with the pattern from an Allow/Disallow rule, and the "most specific" (longest matching) rule wins. That means /auth and /auth/ are not equivalent patterns; they differ in what they match around the character immediately after "auth".

What matches with Disallow: /auth
  • Any URL path that starts with /auth, including /auth itself and any path that continues after auth (for example /auth/anything and /authfoo). This follows from prefix matching of the path value in Allow/Disallow rules [1].

What matches with Disallow: /auth/
  • Only URL paths that start with /auth/ (note the trailing slash), i.e., /auth/ and subpaths like /auth/login, but NOT /auth (the version without the slash), because that path does not start with /auth/. This is consistent with the prefix-matching examples and the "longest specific match" rule in RFC 9309 [2].

Practical guidance
  • If you intend to block a directory and its contents, the standard approach is to include the trailing slash in the directory path pattern (e.g., Disallow: /auth/) [3].
  • If you instead use Disallow: /auth, you may accidentally block URLs you didn't mean to (e.g., /auth.html-like paths or other paths that merely begin with /auth but aren't under /auth/). This risk is explicitly noted in explanations of robots.txt prefix matching [4].

Answer to your specific comparison
  • Disallow: /auth blocks both /auth and /auth/… (and anything else with the prefix /auth).
  • Disallow: /auth/ blocks /auth/… but does not block /auth (no trailing slash).
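
Applied to this PR, a sketch of the robots.txt that the current policy would emit (Host/Sitemap lines omitted); the comments mark what stays reachable under prefix matching:

  User-agent: *
  Allow: /
  Disallow: /auth/        # blocks /auth/login etc., but not the exact path /auth
  Disallow: /webview/     # /webview itself stays crawlable
  Disallow: /report/      # /report itself stays crawlable
  Disallow: /callvan/add
  # ...remaining entries from the policy excerpt above...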



🏁 Script executed:

cat -n ./next-sitemap.config.js

Repository: BCSDLab/KOIN_WEB_RECODE

Length of output: 2207


🏁 Script executed:

wc -l ./next-sitemap.config.js

Repository: BCSDLab/KOIN_WEB_RECODE

Length of output: 95


The robots.txt disallow configuration is incomplete, so crawlers can still reach private routes.

The following paths are missing from the robots.txt disallow list (lines 49-65):

  • /store/review/edit/* (present in PRIVATE_PATHS)
  • /monitoring/* (present in PRIVATE_PATHS)

In addition, /auth/, /webview/, and /report/ are configured only with trailing-slash rules, so the exact paths /auth, /webview, and /report are not blocked and remain reachable to crawlers.

🔧 Suggested fix
         disallow: [
-          '/auth/',
-          '/webview/',
-          '/report/',
+          '/auth',
+          '/webview',
+          '/report',
           '/callvan/add',
-          '/callvan/chat/',
+          '/callvan/chat',
           '/callvan/notifications',
           '/timetable/modify',
           '/clubs/new',
-          '/clubs/edit/',
+          '/clubs/edit',
-          '/clubs/recruitment/edit/',
+          '/clubs/recruitment/edit',
-          '/lost-item/edit/',
-          '/lost-item/report/',
+          '/lost-item/edit',
+          '/lost-item/report',
           '/lost-item/chat',
-          '/store/review/',
+          '/store/review',
+          '/store/review/edit',
+          '/monitoring',
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@next-sitemap.config.js` around lines 43 - 65, The
robotsTxtOptions.policies[0].disallow array is missing specific patterns and
non-trailing-slash variants so crawlers can still reach private routes; update
the disallow list used by robotsTxtOptions in next-sitemap.config.js (and ensure
PRIVATE_PATHS is honored) to add the wildcard patterns '/store/review/edit/*'
and '/monitoring/*' and include both trailing and non-trailing variants for the
listed prefixes (e.g., add '/auth' in addition to '/auth/', '/webview' in
addition to '/webview/', and '/report' in addition to '/report/') so all exact
and nested paths are disallowed by the policy.

Contributor

@ff1451 ff1451 left a comment


Thanks for the hard work!

Comment thread next-sitemap.config.js Outdated
'/lost-item/report',
'/lost-item/chat',
'/store/review',
'/monitoring',
Contributor


Could you explain what these routes are used for?

Contributor Author


These pages involve personal information or require authentication, so I listed them separately to block search engine crawling!

- Remove the /monitoring path, which does not exist
- /store/review/edit/* → /store/review/edit/** (matches two levels deep)
- Remove the /webview and /report lines, which have no standalone pages

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Labels

✨ Feature 기능 개발

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants